WO 2006/092658 



PCT/IB2005/003652 



APPARATUS AND METHOD FOR DETECTING OBJECTS 

BACKGROUND 

[0001] The present invention pertains to the technical field of object detection, and in 

particular to techniques for detecting a moving object in front of a vehicle. 

[0002] Japanese Kokai Patent Application No. 2001-84497 discloses a position-detecting

device in which multiple objects are extracted from the images captured by an
onboard camera, and based on the variation over time of the position in the y-direction (i.e.,
the height) of the objects, correction of the y-coordinates of the objects is performed taking into
consideration the pitching and other behavior of the vehicle. As a result, it is possible to
detect the position of each object by excluding the influence of the pitching and other
behavior of the vehicle. However, because detection of the y-coordinate is performed based
on the variation over time of the position of the object in the y-direction, it is difficult to detect
variation in the position of the object due to pitching caused by passengers, cargo,
etc.

SUMMARY 

[0003] In accordance with one aspect of the invention, an apparatus is provided for 

detecting the position of an object in one or more images captured by an image pickup device 
mounted on a vehicle. The apparatus includes a memory on which is stored a plurality of 
images captured by the image pickup device, including a first image of an object taken at a 
first time when the vehicle is balanced and a second image of the object captured at a second 
time; and a controller operatively coupled to the memory and adapted to determine whether 
the second image was captured when the vehicle was balanced, and to determine the position 
of the object in the second image based on the position of the object in the first image if the 
second image was captured when the vehicle was not balanced. 

[0004] In accordance with another aspect of the invention, a method is provided for 

detecting the position of an object in an image captured by an image pickup in a vehicle. The
method includes determining whether a first image of an object captured by an image pickup
was captured when the vehicle was balanced; and determining the position of the object in
the first image if the first image was captured when the vehicle was not balanced, which
determination is based on a second image of the same object that was captured when the
vehicle was balanced.

BRIEF DESCRIPTION OF THE DRAWINGS 

[0005] The description herein makes reference to the accompanying drawings 

wherein like reference numerals refer to like parts throughout the several views, and wherein: 
[0006] Figure 1 is a block diagram of a device for detecting objects in accordance 

with a first embodiment of the invention. 

[0007] Figure 2A is a diagram of the position of an object relative to a vehicle in 

which the object-detecting device of Figure 1 has been installed, when the vehicle is in a
balanced state.

[0008] Figure 2B is a diagram of the position of an object relative to a vehicle in 

which the object-detecting device of Figure 1 has been installed, when the vehicle is pitching. 
[0009] Figure 3 is a graph over time of angle θp of vehicle pitching, the velocity in

the y-direction of the object in images captured by the device of Figure 1, and the 
acceleration in the y-direction of the object in the images. 

[0010] Figure 4A is a diagrammatic side elevation of the position of an object in the

path of a vehicle in which the object-detecting device of Figure 1 has been installed.
[0011] Figure 4B is a diagrammatic overhead plan view of the position of an object in

the path of a vehicle in which the object-detecting device of Figure 1 has been installed.
[0012] Figure 5 is a flow chart of the operation of the object-detecting device shown

in Figure 1.

[0013] Figure 6 is a diagram illustrating computation of image velocity by the object- 

detecting device of Figure 1. 

DETAILED DESCRIPTION 

[0014] In an embodiment of the invention described below, a judgment is made as to 

whether the picked-up image was captured when the vehicle itself was balanced; if it is 
judged that the picked-up image was captured when the vehicle itself was not balanced, the 
position of the object present in the picked-up image is computed based on the information 
about the object computed from the picked-up image captured when the vehicle itself was
balanced. As a result, even when pitching takes place due to passengers or cargo, it is
determined that the picked-up image was captured when the vehicle itself was not balanced, 
and it is possible to compute the position of the object in the image correctly. 

[0015] Figure 1 is a block diagram of a first embodiment of the invention. The 

object-detecting device 100 is mounted on board the vehicle, and it includes camera 1, image 
memory 2, microcomputer 3, and display unit 4. The camera 1 is arranged at the front of the 
vehicle, and it takes pictures at a constant time interval Δt. The image memory 2 converts
each image captured with camera 1 to digital data and stores it. The microcomputer 3 reads 
the digital image stored in image memory 2. As will be explained later, the size of the object 
in real space and the distance to the object are detected, taking into consideration the pitching 
of the vehicle. The display unit 4 displays the detected object in a bird's-eye-view mapped 
downward onto a map with the vehicle itself at the center. 

[0016] For the purpose of illustrating the first embodiment of the invention, it is assumed

that in the image captured with camera 1, only one moving object, such as another vehicle, is 
present, and the real-space size of the object present in the image has a width that can be 
detected on the image. 

[0017] Figures 2A and 2B show an example of change in the position of the object

present in front of the vehicle in the picked-up image when the vehicle is in balance (Figure
2A) and when pitching of the vehicle takes place (Figure 2B). As shown in Figure 2A,
assuming the deviation angle to be θo when the object is viewed from the camera vision axis
θ with respect to the horizontal direction, y-coordinate value yo of the object in the image picked up when
the vehicle is balanced (the balanced state is defined later) is computed using the following Equation 1:

$$y_o = \alpha\,\theta_o \qquad (1)$$

[0018] Here, α is a constant that can be uniquely computed from the image pickup

element size, focal distance, etc., of camera 1 (hereinafter referred to as the "camera
parameters").

[0019] In this case, when pitching of the vehicle takes place, as shown in Figure 2B,

and the pitching angle in this case is θp, y-coordinate yo' of the object in the image is
computed using the following Equation 2:

$$y_o' = \alpha\,\theta_o' = \alpha\,(\theta_o + \theta_p) = y_o + \alpha\,\theta_p \qquad (2)$$

[0020] That is, the change in the y-coordinate of the object in the image when 

pitching takes place can be seen to be proportional to the size of pitching angle θp.

[0021] Figure 3 shows the relationship over time of pitching angle θp and the y-axis

acceleration of the object in the image, that is, the image acceleration in the vertical
direction.

[0022] As shown in Figure 3, characteristic curve 3a shows the up/down periodic

movement over time when pitching takes place;
characteristic curve 3b shows the change in image velocity over time; and
characteristic curve 3c shows the change in image acceleration over time. As shown in
Figure 3, when image acceleration 3c is zero, image velocity 3b is maximum or minimum,
and periodic movement 3a is at the inflection point. Because the inflection point of periodic
movement 3a shows the point at which the vehicle is balanced, it is possible to judge that an
image with an image acceleration 3c of zero is one captured at an instant when the vehicle itself was
balanced while pitching of the vehicle itself was taking place.
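
The zero-acceleration criterion described above can be illustrated with a minimal Python sketch. It is not the implementation of this embodiment: the function name balanced_frame_indices, the second-difference estimate of the acceleration, and the threshold tol are assumptions introduced here, since measured acceleration is rarely exactly zero.

```python
import math


def balanced_frame_indices(y_positions, dt, tol):
    """Return indices of frames whose image acceleration is close to zero.

    y_positions: y-coordinate of the tracked object in consecutive frames.
    dt: frame interval in seconds.
    tol: hypothetical threshold on |acceleration| (an assumption, not from the text).
    """
    indices = []
    for i in range(1, len(y_positions) - 1):
        # Second central difference approximates the image acceleration at frame i.
        accel = (y_positions[i + 1] - 2.0 * y_positions[i] + y_positions[i - 1]) / dt**2
        if abs(accel) <= tol:
            indices.append(i)
    return indices


# Synthetic pitching: the object's y-coordinate oscillates around 100 pixels
# with a period of 40 frames, sampled every 2 ms.
dt = 0.002
ys = [100.0 + 5.0 * math.sin(2.0 * math.pi * i / 40.0) for i in range(61)]
balanced = balanced_frame_indices(ys, dt, tol=1.0)
```

In this synthetic signal the selected frames are those at which the sinusoid crosses its mean value, which corresponds to the inflection point of periodic movement 3a.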

[0023] Also, in this embodiment, edge extraction processing is performed on the picked-up image,

and the well-known gradient method or block matching method is adopted to
compute the optical flow, so that the velocity of the object present in the image is detected. As a
result, the image velocity 3b and image acceleration 3c are detected. The edge extraction
processing and the optical flow computation processing are well-known technologies, and are
explained below in connection with Figure 6.

[0024] Based on the image for which image acceleration 3c in the y-direction is zero

for the object detected on the image, that is, the image captured when the vehicle itself was
balanced (the balanced image), it is possible to compute, as explained below, the y-coordinate of the object in the image,
the width of the object in real space (the size of the object), and the distance
from camera 1 to the object, that is, distance D from the focal position of camera 1 to the
object. Figures 4A and 4B show an example of the situation in which the image captured
when the vehicle is balanced is used to compute the y-coordinate of the object in the image,
the width of the object in real space, and the distance between the focal position of camera 1
and the object. Figure 4A shows a side view of the object, and Figure 4B shows a top view
of the object.

[0025] As shown in Figure 4A, using vision axis θ of camera 1 and apparent angle

θo of the object that can be detected in the image, the y-coordinate of the object in the image can
be computed using the following Equation 3.

$$y = \alpha\,(\theta + \theta_o) \qquad (3)$$

[0026] Also, assuming the pre-measured camera mounting height to be H, the vision

axis of camera 1 to be θ, and the apparent angle of the object to be θo, distance D from the
focal position of camera 1 to the object can be computed using the following Equation 4.

$$D = H / \tan(\theta + \theta_o) \qquad (4)$$

[0027] Width Ws of the object is then computed. As shown in Figure 4B, based on

the relationship between image width xw and lateral angle θx of the object, the following
Equation 5 is obtained.

$$x_w = \beta\,\theta_x \qquad (5)$$

Here, β is a constant that can be uniquely computed from the camera parameters.

[0028] Consequently, using Equation 5, one can compute width Ws of the object

using the following Equation 6.

$$W_s = \theta_x \cdot D = (x_w / \beta) \cdot D \qquad (6)$$

[0029] Consequently, based on the balanced image, it is possible to compute the y-coordinate

of the object, width Ws of the object, and distance D from the focal position of
camera 1 to the object using Equations 3, 4 and 6.
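
As an illustration only, Equations 3, 4 and 6 for a balanced image can be sketched in Python as follows. The function name balanced_measurements and the numerical values in the usage lines are hypothetical; angles are assumed to be in radians and H in meters.

```python
import math


def balanced_measurements(theta, theta_o, x_w, H, alpha, beta):
    """Compute (y, D, W_s) for an image captured in the balanced state.

    theta:   vision axis of the camera (radians), known in the balanced state.
    theta_o: apparent angle of the object (radians).
    x_w:     width of the object in the image (pixels).
    H:       camera mounting height.
    alpha, beta: constants derived from the camera parameters.
    """
    y = alpha * (theta + theta_o)       # Equation 3
    D = H / math.tan(theta + theta_o)   # Equation 4
    theta_x = x_w / beta                # Equation 5 solved for the lateral angle
    W_s = theta_x * D                   # Equation 6
    return y, D, W_s


# Hypothetical values chosen only to exercise the formulas.
y, D, W_s = balanced_measurements(
    theta=0.02, theta_o=0.03, x_w=60.0, H=1.2, alpha=1000.0, beta=1000.0
)
```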

[0030] However, when pitching of the vehicle takes place, the vision axis θ of

camera 1 changes according to the pitching angle, and the magnitude of the change is unknown.
Consequently, in this case, it is impossible to compute the y-coordinate of the object, width
Ws of the object, and distance D from the focal position of camera 1 to the object using the
processing described above. Instead, for an image for which image acceleration 3c is not zero, these
can be computed as follows.

[0031] First of all, the balanced images that were captured at different times T1 and T2 and

that contain an object judged to be the same as the object in the image for which image acceleration
3c is not zero are read from image memory 2. Whether the object detected in the image
with a non-zero image acceleration 3c and the objects in the balanced images are the same object can be judged by checking whether
they have a similar velocity in the images and a similar shape after edge extraction processing
and detection. Also, camera 1 in this embodiment is a high-speed camera, and it takes
consecutive pictures from the front of the vehicle at a minute prescribed time interval Δt,
such as 2 ms. The precondition is that the balanced images captured at different times T1 and
T2 must contain the same object detected in the image with non-zero image acceleration 3c.
The following explanation is given based on this precondition.

[0032] It is possible to represent the distances D1 and D2 between the focal position

of camera 1 and the object in the balanced images at times T1 and T2 with the following
Equations 7 and 8, using Equation 4 and the apparent angles θo1 and θo2 of the object
at the respective times.

$$D_1 = H / \tan(\theta + \theta_{o1}) \qquad (7)$$
$$D_2 = H / \tan(\theta + \theta_{o2}) \qquad (8)$$

[0033] Also, width Ws of the object at times T1 and T2 can be represented with the

following Equations 9 and 10, using Equation 6 and the lateral angles θx1 and θx2 of the
object at the respective times.

$$W_s = \theta_{x1} \cdot D_1 \qquad (9)$$
$$W_s = \theta_{x2} \cdot D_2 \qquad (10)$$

[0034] As a result, by substituting Equations 7 and 8 into Equations 9 and 10,

respectively, one can obtain the following Equations 11 and 12:

$$W_s = \theta_{x1} \cdot H / \tan(\theta + \theta_{o1}) \qquad (11)$$
$$W_s = \theta_{x2} \cdot H / \tan(\theta + \theta_{o2}) \qquad (12)$$

[0035] In Equations 11 and 12, when an image of a scene far in front of the vehicle is

captured with on-board camera 1, it is possible to set approximately θ ≈ 0, θo1 ≈ 0, and θo2 ≈
0. Consequently, in Equation 11, one has tan(θ + θo1) ≈ θ + θo1. In Equation 12, one has
tan(θ + θo2) ≈ θ + θo2. As a result, Equations 11 and 12 are represented by the following Equations
13 and 14, respectively. Based on this relationship, one can obtain the following Equations
15 and 16:

$$W_s = \theta_{x1} \cdot H / (\theta + \theta_{o1}) \qquad (13)$$
$$W_s = \theta_{x2} \cdot H / (\theta + \theta_{o2}) \qquad (14)$$
$$W_s \cdot (\theta + \theta_{o1}) = \theta_{x1} \cdot H \qquad (15)$$
$$W_s \cdot (\theta + \theta_{o2}) = \theta_{x2} \cdot H \qquad (16)$$

[0036] Here, by subtracting Equation 16 from Equation 15, the following Equation 17

can be obtained, and it is possible to compute width Ws of the object.

$$W_s = H \cdot (\theta_{x1} - \theta_{x2}) / (\theta_{o1} - \theta_{o2}) \qquad (17)$$
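
Equation 17 can be illustrated with the following minimal Python sketch. The function name and the guard against a zero denominator are assumptions added here; the guard reflects that the two balanced images must show the object at different apparent angles for the subtraction of Equations 15 and 16 to yield a usable result.

```python
def object_width_from_balanced_pair(theta_x1, theta_o1, theta_x2, theta_o2, H):
    """Compute object width W_s from two balanced images using Equation 17.

    theta_x1, theta_x2: lateral angles of the object at times T1 and T2 (radians).
    theta_o1, theta_o2: apparent angles of the object at times T1 and T2 (radians).
    H: camera mounting height.
    The vision axis theta cancels out and is therefore not needed.
    """
    denom = theta_o1 - theta_o2
    if denom == 0.0:
        # Hypothetical guard: identical apparent angles give no information.
        raise ValueError("apparent angles at T1 and T2 must differ")
    return H * (theta_x1 - theta_x2) / denom


# Hypothetical angles consistent with Equations 13 and 14 for an object
# 1.8 wide seen at distances 30 and 20, with H = 1.2 and theta = 0.01.
W_s = object_width_from_balanced_pair(0.06, 0.03, 0.09, 0.05, H=1.2)
```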

[0037] As a result, even when pitching of the vehicle takes place, and the vision axis

θ of camera 1 for the images captured in this case becomes unknown, it is still possible to
compute width Ws of the object from camera mounting height H, as well as from θo1, θo2, θx1
and θx2, which can be measured from the two balanced images captured at different times for the same object.

[0038] Based on the width Ws of the object computed in this case, distance D from

the focal position of camera 1 to the object in the images captured during the pitching state
can be derived with the following Equation 18 using Equation 6.

$$D = W_s / \theta_x = W_s \cdot \beta / x_w \qquad (18)$$

[0039] As a result, by means of Equation 4, vision axis θ of camera 1 can be

computed with the following Equation 19 using distance D from the focal position of camera
1 to the object.

$$\theta = \arctan(H / D) - \theta_o \qquad (19)$$

[0040] As a result, even when there is change in vision axis θ of camera 1 when

pitching of the vehicle takes place, it is still possible to compute width Ws of the object and
vision axis θ of camera 1 based on the images captured at different times T1 and T2. Since
vision axis θ of camera 1 has been computed, it is possible to compute the y-coordinate of the
object in the image using Equation 3.
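
The chain of Equations 18, 19 and 3 for an image captured during pitching can be sketched in Python as follows. This is an illustration under stated assumptions, not the implementation of the embodiment: the function name and the numerical values are hypothetical, and W_s is assumed to have been obtained beforehand from two balanced images.

```python
import math


def pitching_frame_measurements(W_s, x_w, theta_o, H, alpha, beta):
    """Recover (D, theta, y) for an image captured while the vehicle is pitching.

    W_s:     object width in real space, computed beforehand via Equation 17.
    x_w:     width of the object in the pitching image (pixels).
    theta_o: apparent angle of the object in the pitching image (radians).
    H:       camera mounting height.
    alpha, beta: constants derived from the camera parameters.
    """
    theta_x = x_w / beta                 # Equation 5 solved for the lateral angle
    D = W_s / theta_x                    # Equation 18
    theta = math.atan(H / D) - theta_o   # Equation 19
    y = alpha * (theta + theta_o)        # Equation 3
    return D, theta, y


# Hypothetical values chosen only to exercise the formulas.
D, theta, y = pitching_frame_measurements(
    W_s=1.8, x_w=72.0, theta_o=0.03, H=1.2, alpha=1000.0, beta=1000.0
)
```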

[0041] Based on the y-coordinate in the image computed using the processing described above, for

example, it is possible to mark the object on a bird's-eye-view map displayed on display unit 4.
As a result, even when pitching of the vehicle takes place, it is still possible to reliably correct 
for the deviation of the object in the y-direction in the image due to pitching, and to display 
the obtained result in a bird's-eye-view map. 

[0042] Figure 5 is a flow chart illustrating the process of object-detecting device 100

in this embodiment. The processing shown in Figure 5 is carried out as follows: the ignition
switch of the vehicle is turned ON, the power supply for the object-detecting device is turned
ON, and the program is started that is used to execute the processing with microcomputer 3.
In step S10, the picked-up images captured with camera 1 and stored in image memory 2 are
read, and process flow then continues to step S20.

[0043] In step S20, edge extraction processing is performed on any image that is read

as described above to compute the optical flow. As a result, image velocity 3b and image
acceleration 3c are computed, and process flow continues to step S30. In step S30, a
determination is made as to whether the computed image acceleration 3c is zero. If it is
determined that the computed image acceleration 3c is zero, the read image is judged to
have been captured when the vehicle was balanced, and process flow continues to step S40.
In step S40, as explained above, the y-coordinate of the object, width Ws of the object, and
distance D from the focal position of camera 1 to the object are computed by means of
Equations 3, 4 and 6.

[0044] Figure 6 is a diagram illustrating computation of image velocity and image

acceleration at step S20. Referring to Figure 6, a captured image 16 is shown moving
toward the left. An observation region 18 of captured image 16 includes a background region
20, an edge region 22 and an object region 24. The luminosity (I) of regions 20, 22 and 24
varies with respect to position (x) as well as time (t), since the object in image 16 is moving
leftward. In Figure 6, luminosity of these regions is plotted (curves 26) against position (x)
at time T1 (solid line) and T2 (dotted line). The curves 26 represented by luminosity at times
T1 and T2 are shifted because of the leftward movement of the image.

[0045] To compute velocity, for each of the pixels extracted as the edge,

variation of luminance in space in the longitudinal direction in the prescribed region is
computed (the slope dI/dx as shown in Figure 6). Then, for each of the pixels extracted as the
edge, variation of luminance over time between the prescribed frames is computed (dI/dt as
shown in Figure 6). For each of the pixels extracted as the edge, from the variation of
luminance in space and the variation of luminance over time, the image velocity v in the
longitudinal direction is computed based on the following formula:

$$\frac{dI}{dx} \cdot v + \frac{dI}{dt} = 0 \qquad (20)$$

[0046] Also, from the variation of the image velocity over time computed in the 

above, the image acceleration is computed. 
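
The gradient-method computation described above can be sketched in Python for a one-dimensional luminance profile. This is a minimal illustration, not the implementation of the embodiment: the function names, the central difference used for dI/dx, and the forward difference used for dI/dt are assumptions introduced here.

```python
def image_velocity(profile_t1, profile_t2, dt, i, dx=1.0):
    """Estimate image velocity at edge pixel i from dI/dx * v + dI/dt = 0.

    profile_t1, profile_t2: luminance sampled along one axis at times T1 and T2.
    dt: time between the two frames.
    i:  index of a pixel extracted as an edge (must not be at either end).
    dx: pixel spacing.
    """
    # Spatial slope of luminance at time T1 (central difference).
    dI_dx = (profile_t1[i + 1] - profile_t1[i - 1]) / (2.0 * dx)
    # Change of luminance over time at the same pixel (forward difference).
    dI_dt = (profile_t2[i] - profile_t1[i]) / dt
    if dI_dx == 0.0:
        raise ValueError("no spatial luminance gradient at this pixel")
    return -dI_dt / dI_dx


def image_acceleration(v_prev, v_curr, dt):
    """Estimate image acceleration from the change in image velocity over time."""
    return (v_curr - v_prev) / dt


# A linear luminance ramp that shifts one pixel toward smaller x in 0.5 s.
p1 = [10.0 * x for x in range(10)]
p2 = [10.0 * (x + 1) for x in range(10)]
v = image_velocity(p1, p2, dt=0.5, i=5)
```

In this sketch a negative velocity corresponds to movement toward smaller x, which matches the leftward movement illustrated in Figure 6 if x increases to the right.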

[0047] On the other hand, when it is determined that the computed image acceleration

3c is not zero, it is judged that the read image was captured when pitching was taking place,
and process flow continues to step S50. In step S50, the images captured in the balanced
state at different times T1 and T2 and containing the same object as the object detected in the
picked-up image are read from image memory 2. Process flow then continues to step S60,
in which width Ws of the object in real space, distance D from the focal position of camera 1 to the
object, and vision axis θ of camera 1 are computed by means of Equations 17, 18, and 19.
Then, process flow continues to step S70, and the y-coordinate of the object is computed
using Equation 3.

[0048] Then, process flow continues to step S80. In this step, based on the y- 

coordinate of the object and width Ws of the object in real space, the detected object is 
mapped on a bird's-eye-view map, and this is displayed on display unit 4. Process flow then 
continues to S90. In step S90, a judgment is made as to whether the ignition switch of the 
vehicle is OFF. If it is not OFF, flow returns to step S10 and the process is repeated. If it is 
OFF, the processing comes to an end. 

[0049] In the present embodiment explained above, the following features can be 

realized. 

[0050] The image acceleration (3c of Figure 3) of the image of the object is computed, 

and the image for which image acceleration 3c for the image of the object is found not to be 
zero is judged to be an image captured when the vehicle was pitching. As a result, it is 
possible to detect the occurrence of pitching without carrying a device for detecting the 
posture of the vehicle or another device for detecting pitching, so that the cost of the device 
can be reduced with this constitution. 

[0051] For the image captured when no pitching of the vehicle takes place, that is, for

an image captured in the balanced state, distance D from the focal position of camera 1 to the
object is computed based on camera mounting height H, vision axis θ of camera 1, and
apparent angle θo of the object, and object width Ws can be computed based on image width
xw, object lateral angle θx, and the distance from the focal position of camera 1 to the object.
As a result, the distance to the object and the size of the object can be detected without any
need for a dedicated sensor, and the device can be realized with a simple constitution.

[0052] When pitching of the vehicle takes place, width Ws of the object in real space

is computed based on the images that were captured in the balanced state at different times T1 and T2
and that were determined to contain the same object as that detected in the image captured when pitching took place,
and distance D from the focal position of camera 1 to the object and vision axis θ of camera 1
are computed based on the computed result. As a result, even when vision axis θ of camera
1 is uncertain when pitching takes place, it is still possible to correctly compute the vision
axis θ of camera 1, the distance to the detected object, and the size of the detected object.

[0053] Also, when pitching of the vehicle takes place, the y-coordinate of the object 

in the image is computed based on vision axis θ of camera 1 and the object is mapped on the
bird's-eye-view map and displayed on display unit 4. As a result, even when pitching of the 
vehicle takes place, it is still possible to reliably correct for deviation in the y-direction in the 
image of the object due to pitching, and to display the corrected result on the bird's-eye-view 
map. 

[0054] The foregoing embodiment has been described in order to allow easy 

understanding of the present invention, and does not limit the present invention. On the 
contrary, the invention is intended to cover various modifications and equivalent 
arrangements included within the spirit and scope of the appended claims, which scope is to 
be accorded the broadest interpretation so as to encompass all such modifications and 
equivalent structures as is permitted under the law. 

[0055] For example, in the embodiment, in order to detect image velocity 3b and 

image acceleration 3c, edge extraction processing is performed on the picked-up image, and 
the optical flow is computed. However, other schemes can be adopted to detect image 
velocity 3b and image acceleration 3c. 

[0056] Also, in the embodiment, the image for which image acceleration 3c is zero is 

judged to be an image captured when the vehicle itself was balanced. However, it is also 
possible to judge that an image for which image velocity 3b is plus or minus and image 
acceleration 3c is zero is an image captured when the vehicle itself was balanced. In this way, 
even when the characteristics on the extension side and those on the contraction side are 
different due to the vehicle suspension, it is still possible to correctly detect the balance state 
of the vehicle. 






[0057] In the embodiment, as an example, it is assumed that there is only one moving 

object in the image captured with camera 1. However, the present invention is not limited to
this scheme. For example, a scheme can be adopted in which multiple moving objects are 
present in the image. In this case, the processing is performed for all of the objects detected 
in the image, and the y-coordinate of an object, width Ws of the object in real space, and 
distance D from the focal position of camera 1 to the object are then computed. 

[0058] In the embodiment, as an example, when pitching of the vehicle takes place,

vision axis θ of camera 1 becomes unclear, so that width Ws of the object in real space is
computed, and distance D from the focal position of camera 1 to the object, vision axis θ of
camera 1, and the y-coordinate of the object in the image are computed. However, the
present invention is not limited to this scheme. For example, a scheme can also be adopted in
which even when the mounting position of camera 1 deviates and vision axis θ of camera 1
becomes unclear, the method is used to compute width Ws of the object in real space,
distance D from the focal position of camera 1 to the object, vision axis θ of camera 1, and
the y-coordinate of the object in the image.

[0059] In the embodiment, as an example, the detected object is mapped on a bird's- 

eye-view map for display on display unit 4. However, the present invention is not limited to 
this scheme. For example, a scheme can be adopted in which the object is mapped on a 
planar map or on another type of map for display. 

[0060] In the following claims, the camera 1 corresponds to the image pickup means, 

and microcomputer 3 corresponds to the image judgment means, object position computing 
means, acceleration computing means, velocity computing means, and object information 
computing means. 

[0061] This application is based on Japanese Patent Application No. 2004-351086, 

filed December 3, 2004 in the Japanese Patent Office, the entire contents of which are hereby 
incorporated by reference.

[0062] Also, the above-mentioned embodiments have been described in order to 

allow easy understanding of the present invention, and do not limit the present invention. On
the contrary, the invention is intended to cover various modifications and equivalent
arrangements included within the spirit and scope of the appended claims, which scope is to
be accorded the broadest interpretation so as to encompass all such modifications and
equivalent structures as is permitted under the law.